22 research outputs found

    Towards bounding-box free panoptic segmentation

    In this work we introduce a new bounding-box free network (BBFNet) for panoptic segmentation. Panoptic segmentation is an ideal problem for a bounding-box free approach as it already requires per-pixel semantic class labels. We use this observation to exploit class boundaries from an off-the-shelf semantic segmentation network and refine them to predict instance labels. Towards this goal, BBFNet predicts coarse watershed levels and uses them to detect large instance candidates where boundaries are well defined. For smaller instances, whose boundaries are less reliable, BBFNet also predicts instance centers by means of Hough voting followed by mean-shift to reliably detect small objects. A novel triplet loss network helps merge fragmented instances while refining boundary pixels. Our approach is distinct from previous works in panoptic segmentation that rely on a combination of a semantic segmentation network with a computationally costly instance segmentation network based on bounding boxes, such as Mask R-CNN, to guide the prediction of instance labels using a Mixture-of-Experts (MoE) approach. We benchmark our non-MoE method on the Cityscapes and Microsoft COCO datasets and show competitive performance with MoE-based approaches while outperforming existing non-proposal-based approaches. We achieve this while being computationally more efficient in terms of number of parameters and FLOPs. Video results are provided at https://blog.slamcore.com/reducing-the-cost-of-understanding
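    The abstract does not spell out its triplet loss, so the following is only a minimal sketch of the standard margin-based triplet loss, not BBFNet's novel variant. The function names, the Euclidean distance, and the margin value are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (illustrative, not the paper's exact loss).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

    In an instance-merging setting, one would expect the anchor and positive to be pixel embeddings from the same instance and the negative to come from a different instance, but that pairing is an assumption here.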

    The application of KAZE features to the classification echocardiogram videos

    In the computer vision field, SIFT and SURF are prevalent approaches for extracting scale-invariant points and have demonstrated a number of advantages. However, when they are applied to medical images with relatively low contrast between target structures and surrounding regions, these approaches lack the ability to distinguish salient features. Therefore, this research proposes a different approach: extracting feature points using the emerging KAZE method. To categorise a collection of echocardiogram video images, this paper combines KAZE feature points with three popular representation methods: bag of words (BOW), sparse coding, and Fisher vector (FV). Compared with SIFT features represented using sparse coding, which give 72% overall performance on the classification of eight viewpoints, KAZE features integrated with BOW, sparse coding or FV improve performance significantly, with accuracies of 81.09%, 78.85% and 80.8% respectively. When distinguishing only three primary view locations, KAZE achieves 97.44% accuracy, whereas SIFT features achieve 90%.
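    As a rough illustration of the bag-of-words representation mentioned above, the sketch below turns a set of local descriptors into a normalised histogram over a codebook. It assumes the descriptors have already been extracted (for example by a KAZE implementation) and that the codebook was learned beforehand, for instance by k-means; both inputs and all function names are hypothetical, and the toy vectors are 2-D only for readability.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to `descriptor` (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, codebook):
    """Normalised histogram of codeword assignments for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # No keypoints detected: return an all-zero histogram.
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Toy data: two codewords and four descriptors (illustrative values only).
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [1.0, 1.0]]
histogram = bow_histogram(descriptors, codebook)
```

    The resulting fixed-length histogram is what a classifier would consume, regardless of how many keypoints each frame produced.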

    Multi-resolution 3D mapping with explicit free space representation for fast and accurate mobile robot motion planning

    With the aim of bridging the gap between high quality reconstruction and mobile robot motion planning, we propose an efficient system that leverages the concept of adaptive-resolution volumetric mapping, which naturally integrates with the hierarchical decomposition of space in an octree data structure. Instead of a Truncated Signed Distance Function (TSDF), we adopt mapping of occupancy probabilities in log-odds representation, which allows us to represent both surfaces and the entire free (i.e. observed) space, as opposed to unobserved space. We introduce a method for choosing resolution on the fly, in real time, by means of a multi-scale max-min pooling of the input depth image. The notion of explicit free space mapping, paired with the spatial hierarchy in the data structure as well as map resolution, allows for collision queries, as needed for robot motion planning, at unprecedented speed. We quantitatively evaluate mapping accuracy, memory, runtime performance, and planning performance, showing improvements over the state of the art, particularly in cases requiring high resolution maps.
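    To make the log-odds occupancy idea concrete, here is a minimal sketch of the textbook log-odds update on a sparse voxel map. It is not the system described in the abstract: the flat dictionary stands in for the octree, and the sensor-model probabilities (0.7 for a hit, 0.4 for a miss) and clamping bounds are illustrative assumptions.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def probability(log_odds):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Assumed inverse sensor model and clamping bounds (illustrative values).
L_HIT = logit(0.7)
L_MISS = logit(0.4)
L_MIN, L_MAX = -4.0, 4.0


def update(log_odds, hit):
    """Add the measurement's log-odds and clamp the result."""
    delta = L_HIT if hit else L_MISS
    return min(L_MAX, max(L_MIN, log_odds + delta))


def integrate(voxels, key, hit):
    """Fuse one measurement into the map; unseen voxels start at log-odds 0."""
    voxels[key] = update(voxels.get(key, 0.0), hit)


def classify(voxels, key):
    """Distinguish observed free space from space that was never observed."""
    log_odds = voxels.get(key)
    if log_odds is None:
        return "unobserved"
    if log_odds > 0.0:
        return "occupied"
    if log_odds < 0.0:
        return "free"
    return "uncertain"
```

    Because a voxel only enters the map once a ray has touched it, a planner can treat "free" and "unobserved" differently, which is the distinction the abstract emphasises.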

    Solving double-sided puzzles: Automated assembly of torn-up banknotes evidence

    Reconstruction of 2D objects is a problem concerning many different fields such as forensic science, archiving, and banking. In the literature, it is treated as a one-sided puzzle problem. This study, however, handles torn banknotes as a double-sided puzzle problem for the first time. In addition, a new dataset (ToB) is created for solving this problem. A selection approach based on the Borda count method is adopted to decide which keypoint-based method should be used in the proposed reconstruction system. The selection approach identified Accelerated-KAZE (AKAZE) as the most successful keypoint-based method. This study also proposes new measures for determining the success ratio of the reconstructed banknotes and calculating their loss ratio. When the torn banknotes were reconstructed with the AKAZE-based reconstruction system, the average success rate was 95.55% according to the proposed metric.
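    The Borda count selection step can be sketched as follows. This is the classic Borda rule, in which each ranking awards a candidate one point for every candidate ranked below it; how the study builds its rankings is not stated in the abstract, so the candidate names and the toy rankings are purely hypothetical.

```python
def borda_count(rankings):
    """Sum Borda points over several rankings (best candidate listed first)."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top entry earns n - 1 points, the last entry earns 0.
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return scores


def borda_winner(rankings):
    """Candidate with the highest total; ties go to the alphabetically first."""
    scores = borda_count(rankings)
    return max(sorted(scores), key=scores.get)


# Hypothetical rankings of three methods under three evaluation criteria.
rankings = [
    ["method_a", "method_b", "method_c"],
    ["method_b", "method_a", "method_c"],
    ["method_a", "method_c", "method_b"],
]
scores = borda_count(rankings)
winner = borda_winner(rankings)
```

    With these toy rankings, method_a collects 5 points, method_b 3 and method_c 1, so method_a would be selected.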